Abstract

We examine the degree to which fluctuating dynamics on exponentially expanding
spaces remember initial conditions. In de Sitter space, the global late-time configuration
of a free scalar field always contains information about early fluctuations.
By contrast, fluctuations near the boundary of Euclidean Anti-de Sitter may or may
not remember conditions in the center, with a transition at $\Delta = d/2$. We connect
these results to literature about statistical mechanics on trees and make contact with
the observation by Anninos and Denef that the configuration space of a massless dS
field exhibits ultrametricity. We extend their analysis to massive fields, finding that
preference for isosceles triangles persists as long as $\Delta_- < d/4$.
1 Introduction

The exponential expansion of de Sitter space tends to wash away information about initial
conditions [1]. This cosmic no-hair principle, which has both classical [2] and quantum
mechanical [3] versions, gives inflation [4] predictive power, and may do the same for 
eternal inflation [3]. Cosmic no-hair can be paraphrased as follows: expectation values 
of quantities defined in a fixed number of regions of fixed proper size forget the initial 
conditions at sufficiently late time. The "fixed" qualifiers are important. Local quantities 
forget initial conditions, but global quantities, such as integrals of fields over the entire 
spatial slice, may not. 

A closely related question has been studied thoroughly in the context of statistical 
mechanics on trees. The prototypical example, reviewed in [S], is the Ising model on an 
infinite tree with free boundary conditions. This system can be written via transfer matrix 
as a Markov problem, and an analog of cosmic no-hair follows from Markov convergence. 
The more interesting question, known in the literature as the "reconstruction" or "broadcast"
problem, is whether the probability distribution for a global quantity, such as the
total magnetization of the spins on the leaves of the tree, depends on the value of a spin
at the root. As it turns out, there is a phase transition. Below a critical $T_c$, the majority
vote of spins at infinity tends to coincide with the root [8], while, above $T_c$, the joint
distribution for root and boundary spins is exactly a product: all memory is washed away.

Our first purpose in this paper will be to understand the analog of this transition in
de Sitter space [10] and Euclidean Anti-de Sitter space. As we'll see, the temperature
$T_c$ from the Ising model corresponds to a critical value $\Delta_c = d/2$ for the dimension that
characterizes the falloff of correlation functions near the boundary of dS or EAdS. It is
well known that $\Delta_-$ for free fields in de Sitter never exceeds d/2. Indeed, we'll see that
such fields always remember the initial conditions.

We'll be more quantitative later, but for now, "remembering the initial conditions" 
simply means that, as the cutoff tends to infinity, a finite correlation remains between 
some function of the late-time configuration and early-time fluctuations. In fact, this 
correlation can be very large. Nonperturbative bubble nucleation can be characterized 
by a collection of highly non-Gaussian fields that keep track of the local vacuum index
[11]. In the case where the nucleation probabilities are exponentially small, a late-time
configuration will contain enough information to accurately reconstruct the entire history. 

In contrast to the situation in de Sitter, free fields in EAdS can fall off faster or 
slower than d/2, depending on the choice of standard (no memory) or alternate (memory) 
boundary conditions [12]. The fact that this can go either way means that the memory 
phenomenon should have an interpretation purely in CFT terms. Indeed, one way to make 
the analogy is to consider a Euclidean CFT perturbed by a relevant operator at infinity. 
The existence of memory becomes the question of whether the statistics of functions of an 
order one fraction of all UV degrees of freedom are sensitive to the infrared perturbation. 

Memory can also be understood in the fixed point CFT, or unperturbed (EA)dS, as a 
sort of failure of the central limit theorem. Limiting late-time global quantities in (EA)dS, 
or functions of all the UV degrees of freedom in a CFT, involve an infinite number of 
variables. If these variables are sufficiently independent, the distribution for the global 
quantity must be Gaussian. Generally, we'll see that memory of initial conditions translates 
to non-Gaussian statistics for these global quantities. In the language of Gibbs states, this 
is related to the existence of multiple extreme components (see footnote 1).

The presence of multiple Gibbs states in the Hartle-Hawking ensemble was recently
pointed out by Anninos and Denef [13], who analyzed a massless field in dS and demonstrated
the existence of multiple extreme states. They went on to compute probability
distributions for the distances between three field configurations, independently drawn
from the Hartle-Hawking ensemble, and found a non-Gaussian, somewhat ultrametric distribution
of distances. Here we will extend their analysis to positive mass fields in de
Sitter, finding that the Anninos-Denef ultrametricity extends to massive fields as long as
$\Delta_- < d/4$, but that the overlap distributions become Gaussian beyond this point.

The plan of this paper is as follows. In section 2 we will examine the question of
memory in five different systems: first reviewing literature about the Ising model on a
tree, then moving on to Gaussian fields on a tree, free fields in EAdS, free fields in dS, and
general CFTs. In section 3 we will discuss the space of extreme states. We'll give a rather
explicit description for the Ising model on a tree, before going on to extend Anninos-Denef 
ultrametricity to positive mass fields in de Sitter. 

2 Memory of initial conditions 
2.1 Ising model on a tree 

Like many concepts in statistical mechanics, memory can be illustrated most simply with 
the Ising model. Consider the system on a (rooted) regular tree of degree (p + 1), defined 
by associating a classical spin variable $s_i = \pm 1$ to each site, and taking the Boltzmann
weighting with the usual pairwise Hamiltonian,

$$H = -J \sum_{\text{links } ij} s_i s_j. \tag{1}$$

For any finite subsystem of the tree, the number of boundary spins is of order the number 
of total spins, so we have to be careful about specifying the boundary conditions. For now, 
we will consider free boundary conditions. 

1 We will use the term "extreme state" instead of "pure state." And we mean extreme states in the
tree/bulk, not in the CFT.

We can state the criterion for memory as follows. Write $\mathbf{s}_u$ for the collection of
spins at generation u (see figure 1) in the tree, write $s_0$ for the root spin, and consider the
mutual information

$$I(s_0; \mathbf{s}_u) \tag{2}$$

between the spins at generation u and at the root. If this quantity approaches zero as
u tends to infinity, then conditioning on the value of the root spin doesn't change the
distribution for the spins at $u \to \infty$. We will describe this situation by saying that the
system forgets the initial conditions. On the other hand, if the mutual information is
bounded from below by a positive constant as u tends to infinity, then there is correlation
between the spin at the root and the collection of spins at $u \to \infty$, and we will say that
there is memory in the system.

Figure 1: The time u in a p = 2 tree, with generations labeled from u = 0 at the root up to u = 3.

A priori, it might have been the case that the Ising model either always has memory, 
or never does. Here, we will give a heuristic argument that a transition happens at a 
finite value of the temperature, leaving the proof to the literature [6], [7], [9]. To make
the argument, and to relate the Ising model to our other models, it will be very useful
to rewrite the statistical mechanics problem defined by eq. (1) as a branching Markov
process.

This is a standard application of transfer-matrix logic. The basic point is that the 
probability distribution for a spin at generation (u + 1) depends on the spins at earlier 
generations only through the parent at generation u. Combined with the free boundary 
conditions at infinity, this means that if the parent spin is up, the child is more likely to be
up than down by a ratio of Boltzmann factors $e^{2J/T}$. More generally, we can write the
probability distribution for the spin of the child as a two component vector, obtained from
that of the parent by a Markov matrix

$$G = \begin{pmatrix} 1-\gamma & \gamma \\ \gamma & 1-\gamma \end{pmatrix}, \tag{3}$$

where the "flip probability" $\gamma$ is given in terms of the temperature by the equation

$$\frac{P(\text{child} = \text{parent})}{P(\text{child} \neq \text{parent})} = e^{2J/T} = \frac{1-\gamma}{\gamma}. \tag{4}$$
To obtain the full probability distribution for all spins at generation u + 1, one takes
the distribution for the parent spins at generation u, assigns each parent p children,
and applies the matrix G independently for each child. In particular, the evolution of
the probability distribution along a given path through the tree is an ordinary Markov
process. If we start the root with spin up, the probability distribution for a given
spin at generation u will be evenly split between up and down, up to an exponentially
decaying transient, proportional to $\lambda^u$, where $\lambda = 1 - 2\gamma$ is the second eigenvalue of the
Markov matrix G. It follows that mutual information between the root and a given spin
at generation u tends to zero exponentially.
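
As a quick numerical check of the single-path statement, the following sketch (plain Python; the value of the flip probability used in the usage note is illustrative and not taken from the text) iterates the symmetric two-state Markov matrix and returns the leftover bias P(up) - P(down), which should equal the u-th power of the second eigenvalue, one minus twice the flip probability.

```python
def step(dist, gamma):
    """Apply the symmetric two-state Markov matrix once.

    dist is the pair (P(up), P(down)); gamma is the flip probability.
    """
    up, down = dist
    return ((1 - gamma) * up + gamma * down,
            gamma * up + (1 - gamma) * down)


def bias_after(u, gamma):
    """Return P(up) - P(down) at generation u for a root that starts with spin up."""
    dist = (1.0, 0.0)
    for _ in range(u):
        dist = step(dist, gamma)
    return dist[0] - dist[1]
```

For example, `bias_after(10, 0.1)` can be compared with `(1 - 2 * 0.1) ** 10`; the two agree up to rounding, and both decay exponentially in u.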

Instead of looking at a particular spin, let us consider the total magnetization at 
generation u,

$$M_u = \sum_{i=1}^{p^u} s_i. \tag{5}$$

It is easy to check that if the root starts with spin up, the expected value of $M_u$ is $(p\lambda)^u$,
which grows exponentially with u whenever $p\lambda > 1$. One might be tempted to conclude that the total
magnetization always remembers the spin of the root, but we need to be more careful.
The right thing to do is to compare $(p\lambda)^u$ with the typical width of the distribution
for $M_u$. Suppose that the system has no memory. Then distant spins at generation u
will be independent, and the variance should be proportional to $p^u$, the number of spins at
generation u. We see that the assumption of no memory is consistent only if the bias in the
expected total magnetization $(p\lambda)^u$ is much smaller than the square root of the variance,
$p^{u/2}$. Setting these two equal gives us the critical value of $\lambda$,

$$\lambda_c = \frac{1}{\sqrt{p}}. \tag{6}$$

If the second eigenvalue is larger than this value, the probability distribution for the
total magnetization remains meaningfully biased in the direction of the initial spin. It
follows that the mutual information eq. (2) cannot tend to zero at large u: the system
has memory. On the other hand, if the eigenvalue is smaller than $\lambda_c$, then the bias
becomes small compared to normal fluctuations, so the mutual information between $M_u$
and the initial spin tends to zero as u grows large. Our description here was intuitive,
but the result was proven rigorously in [7], using the previous work [8]. One can see the
transition, as a function of $\lambda$, quite clearly in figure 2.
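
The heuristic comparison of bias and width can be made concrete with exact moments. The sketch below is an illustration under stated assumptions rather than anything quoted from the text: it uses the mean of the total magnetization for a root with spin up, together with a second-moment recursion that follows from the independence of the p subtrees given the parent spin (two sibling spins have correlation equal to the square of the second eigenvalue). The function names are hypothetical.

```python
import math


def magnetization_moments(p, lam, u):
    """Exact mean and variance of M_u for a root with spin up (u >= 1).

    Mean: (p*lam)**u. Second moment V_u obeys
    V_{g+1} = p*V_g + p*(p-1)*lam**2*(p*lam)**(2*g), with V_0 = 1.
    """
    second = 1.0
    for g in range(u):
        second = p * second + p * (p - 1) * lam ** 2 * (p * lam) ** (2 * g)
    mean = (p * lam) ** u
    return mean, second - mean ** 2


def signal_to_noise(p, lam, u):
    """Bias of M_u divided by its standard deviation (u >= 1)."""
    mean, var = magnetization_moments(p, lam, u)
    return mean / math.sqrt(var)
```

With p = 4, a second eigenvalue of 0.9 (flip probability 0.05) gives a ratio that stays of order one as u grows, while 0.2 (flip probability 0.4) gives a ratio that decays exponentially, in line with the two sides of the transition.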

2 The critical value $\lambda_c = 1/\sqrt{p}$ is the so-called "spin-glass" transition point for the Ising model. The "Ising" transition point
at $\lambda = 1/p$ is a higher temperature transition, related to the ability of the root to detect a uniform up
boundary condition at infinity.







Figure 2: Sample configurations of the spins in the p = 4 tree after six generations, given 
a white initial condition at generation zero. The left panel is firmly on the "memory" side 
of the transition, with $\gamma = 0.05$. The middle panel is at the transition $\gamma = 0.25$, and the
right panel is forgetful, at $\gamma = 0.4$.

There's another way of understanding memory, as a breakdown of the central limit 
theorem. If we assume that the magnetization $M_u$ is correlated with the value of
the initial spin, then (at least at small $\gamma$), the distribution for $M_u$ ought to be bimodal,
and therefore non-Gaussian. On the other hand, if there is vanishing correlation between
$M_u$ and the initial spin, then the tree structure implies that there must also be vanishing
correlation between different patches of the boundary leaves. This means that, in the limit
$u \to \infty$, the random variable $M_u$ is a sum of an infinite number of independent random
variables. The central limit theorem ensures that it will have Gaussian statistics. 

So far, we have considered the forgetfulness of two quantities: a given spin at late u, and
the total magnetization $M_u$. The former never maintains memory of the initial condition
at late time, while the latter does so if and only if the second eigenvalue $\lambda$ is greater than
$1/\sqrt{p}$. What about some other quantity? Perhaps, by constructing a more complicated
function of the spins at generation u, one could find a variable that remembers the initial
condition, even for $\lambda < 1/\sqrt{p}$. For the Ising model, this turns out not to be the case. It
was proven in [9] and [6] that the mutual information itself tends to zero at late time if
$\lambda < 1/\sqrt{p}$. For more general Markov processes on trees, the exact condition is not known.
The "second eigenvalue condition" $\lambda > 1/\sqrt{p}$, also known as the "Kesten-Stigum bound,"
is known to be sufficient for the existence of memory, but it is not in general necessary.

Let us begin to translate these Ising results into the language of de Sitter space. As a
first step, we can use the symmetree framework [11], which models the statistics of bubble
nucleation in eternal inflation using Markov processes on trees. The basic idea is that
the exponential expansion of the tree mocks up the geometry of de Sitter. Each vertex
is assigned a "color", corresponding to the vacuum type at a given spacetime point, and
these colors undergo stochastic transitions, described by Markov dynamics with some rate
matrix G. The Ising model considered above corresponds, in this language, to eternal
inflation with a symmetric two-vacuum landscape, and tunneling probability (per Hubble
four-volume) $\gamma$. Since transitions between vacua are suppressed by tunneling factors, we
should take $\gamma$ to be exponentially small, making the second eigenvalue $(1 - 2\gamma)$ extremely
close to one, and putting us firmly on the side of memory.

For small $\gamma$, the most likely way for reconstruction to fail is for a transition to happen
in the first generation. The probability for this is of order $\gamma$, so the probability of failure is
suppressed by the tunneling rate. For a bubble-nucleation setup with exponentially small
tunneling rates, the late-time configuration will contain an exponentially good record of 
the initial condition. In fact, the late-time configuration will contain an exponentially 
good record of the entire history, since later vertices can be considered roots of the smaller 
trees growing out of them, and the same reconstruction procedure can be applied. The 
scale invariance of the system implies that if any memory exists, an infinite amount does. 

It will be useful, in making the correspondence with the following sections, to understand
the relation between eigenvalues of the rate matrix G and dimensions of fields in de
Sitter space. First, we need a relation between the time variables. A (p + 1)-regular tree
has a spatial volume that increases with generation as $p^u$, while the volume of de Sitter
increases with proper time t as $e^{t d/\ell_{dS}}$. This gives us the identification

$$p^u = e^{t d/\ell_{dS}}, \qquad \text{i.e.} \qquad u = \frac{d}{\log p}\,\frac{t}{\ell_{dS}}. \tag{7}$$

Next, we would like to relate the eigenvalue of a Markov process to the scaling dimension
of the corresponding field. Correlations of operators corresponding to eigenvectors of the
Markov matrix fall off with generation as $\lambda^u$. We identify this with the de Sitter behavior
$e^{-\Delta_- t/\ell_{dS}}$. Combining this with eq. (7), we find

$$\Delta_- = -d \log_p \lambda. \tag{8}$$

The critical value of the Markov eigenvalue $\lambda = 1/\sqrt{p}$ therefore corresponds to a scaling
dimension $\Delta_- = d/2$. A similar identification can be made in Euclidean AdS.
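
The dictionary between Markov eigenvalues and scaling dimensions is simple enough to tabulate. The following minimal sketch assumes only the relation that the dimension equals minus d times the base-p logarithm of the eigenvalue; the function names are hypothetical and the numerical inputs in the usage note are illustrative.

```python
import math


def scaling_dimension(lam, p, d):
    """Delta_- = -d * log_p(lam) for a Markov eigenvalue 0 < lam <= 1."""
    return -d * math.log(lam) / math.log(p)


def ising_dimension(gamma, p, d):
    """Dimension for the Ising chain, whose second eigenvalue is 1 - 2*gamma."""
    return scaling_dimension(1 - 2 * gamma, p, d)
```

For instance, with p = 4 and d = 3 the critical eigenvalue 1/2 maps to d/2, a flip probability of 0.05 lands below d/2 (memory), and 0.4 lands above it (forgetful).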
2.2 Gaussian model on a tree



In the previous subsection, we reviewed the memory transition for the Ising model on a 
tree, and discussed some implications for the symmetree model of eternal inflation. Before 
moving on to de Sitter and Euclidean Anti-de Sitter, we will construct a useful stepping 
stone: the massive Gaussian field on a tree. This model closely resembles free fields in 
(EA)dS, but the tree geometry will allow us to make contact with the Ising memory results, 
the second-eigenvalue condition in particular. 

A massless Gaussian field on a regular (p + 1) tree was previously studied by Zabrodin 
[15]. We will focus on the p = 2 case, but add a mass. To define the model, associate to
each vertex a of the tree a real field variable $\phi(a)$, and form the Boltzmann ensemble with
temperature T = 1 and Hamiltonian (or Euclidean action)

$$H = \frac{1}{2} \sum_{\text{links}} (\phi_i - \phi_j)^2 + \frac{m^2}{2} \sum_{\text{vertices}} \phi_i^2. \tag{9}$$

Just like the Laplacian on hyperbolic space, the discrete tree Laplacian has a gap in the
spectrum, so the statistical mechanics of this system makes sense with somewhat negative
mass-squared. For the p = 2 tree, the gap is $3 - 2\sqrt{2}$, so the action is positive as long as
$m^2 > 2\sqrt{2} - 3$. When $m^2$ is negative, we will call it $-\mu^2$.

To get a feel for this system, we can study the equations of motion, obtained by 
differentiating with respect to $\phi(a)$ at a particular vertex a in the tree. This gives

$$3\phi(a) - \phi(\text{child}_1) - \phi(\text{child}_2) - \phi(\text{parent}) + m^2 \phi(a) = 0. \tag{10}$$

If we take a homogeneous ansatz $\phi \sim \lambda^u$, where u is time in the tree, we find two solutions,

$$\lambda_\pm = \frac{3 + m^2 \pm \sqrt{1 + 6m^2 + m^4}}{4}. \tag{11}$$

For small $m^2$, the solutions are $\lambda_+ \approx 1 + m^2$ and $\lambda_- \approx (1 - m^2)/2$. $\lambda_+$ is greater than one
for positive $m^2$, so one of the solutions grows exponentially in the direction of the tree's
branching. On the other hand, if $m^2 < 0$, then both branches are less than one.
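
The two falloffs can be checked directly against the equation of motion. The sketch below assumes the p = 2 tree: substituting a power of the falloff into the equation of motion (two children one generation later, one parent one generation earlier) gives a quadratic whose roots are the two branches. Function names are hypothetical.

```python
import math


def tree_falloffs(m2):
    """Return (lam_plus, lam_minus), the roots of 2*lam**2 - (3 + m2)*lam + 1 = 0."""
    disc = 1 + 6 * m2 + m2 ** 2
    if disc < 0:
        raise ValueError("m^2 lies below 2*sqrt(2) - 3, so the falloffs are complex")
    root = math.sqrt(disc)
    return (3 + m2 + root) / 4, (3 + m2 - root) / 4


def eom_residual(lam, m2):
    """Left-hand side of the equation of motion for phi ~ lam**u, divided by lam**u."""
    return 3 - 2 * lam - 1 / lam + m2
```

At zero mass this returns the pair (1, 1/2), and for any allowed mass the product of the two roots is 1/2, so the branches meet at the square root of 1/2 when the mass-squared reaches the gap.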

It is well known that boundary conditions can be extremely important for statistical 
mechanics on trees. The reason is that the boundary makes up an order one fraction of 
the tree, for any cutoff. To be careful, we will treat the boundary vertices separately, 
modifying the above Hamiltonian to 

$$H = \frac{1}{2}\sum_{\text{links}} (\phi_i - \phi_j)^2 + \frac{m^2}{2}\sum_{\text{bulk vertices}} \phi_i^2 + \frac{m_\partial^2}{2}\sum_{\text{bdry vertices}} \phi_i^2, \tag{12}$$

where, in general, $m_\partial \neq m$. In order to preserve the symmetry of the tree, we would like
$m_\partial$ to be independent of the cutoff, in the sense that integrating out the boundary vertices
leaves an action for a smaller region, with the same value of $m_\partial$. It is easy to show that
this condition allows two solutions as a function of $m^2$. They can be conveniently written
in terms of the solutions for $\lambda_\pm$ as

$$m_\partial^2 = \frac{1 - \lambda_\pm}{\lambda_\pm}. \tag{13}$$

For the special case of $m^2 = 0$, the solutions are $m_\partial^2 = 0$, corresponding to free boundary
conditions, and $m_\partial^2 = 1$, corresponding to a condition that tends to suppress fluctuations.
More generally, we will see that the "-" branch closely parallels the standard boundary 
conditions in EAdS, while the "+" branch resembles alternate boundary conditions. 

Much like the Ising model, this system can be recast as a branching Markov random 
field. The rate matrix G along each link of the tree is a Gaussian kernel

$$G(\phi, \phi') \propto \exp\left\{ -\frac{(\phi - \phi')^2}{2} - \frac{\alpha}{2}\phi^2 + \frac{\beta}{2}\phi'^2 \right\}, \tag{14}$$

where we'll derive $\alpha$ and $\beta$ below. This kernel assigns the probability distribution for the
field value of a child vertex at generation (u + 1) in terms of the probability distribution
for the parent at generation u, via

$$P_{u+1}(\phi) = \int G(\phi, \phi')\, P_u(\phi')\, d\phi'. \tag{15}$$

To identify the correct values of $\alpha$ and $\beta$, we require that the infinite product of $G(\phi, \phi')$
along each link in the graph should equal $e^{-H}$. Each bulk vertex is the child on one link and
the parent on two, so this means that we need $\alpha - 2\beta = m^2$.
Another constraint comes from requiring the probability to stay normalized,

$$\int G(\phi, \phi')\, d\phi = 1. \tag{16}$$

Doing the Gaussian integral, this means we need $\beta = \alpha/(1 + \alpha)$. These two equations
determine $\alpha$ and $\beta$ in terms of $m^2$. The equation is quadratic, so we have a choice of two
solutions. Again, these can be parameterized in terms of $\lambda_\pm$ as $\alpha = (1 - \lambda_\pm)/\lambda_\pm$, and
$\beta = 1 - \lambda_\pm$. Since the boundary weighting implied by the Markov kernel G is $m_\partial^2 = \alpha$, we
see that the upper/lower sign choice here is the same as the corresponding choice for the
branch of $m_\partial^2$.
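
The two constraints on the kernel parameters are easy to verify numerically. The sketch below is an illustration under assumptions: it takes the kernel to be a Gaussian with a link term, a child weighting and a parent weighting, fixes its prefactor by hand, and checks the normalization with a midpoint-rule sum. The helper names, integration range and grid size are all hypothetical choices.

```python
import math


def kernel_parameters(lam):
    """Child weighting alpha and parent weighting beta for a falloff lam in (0, 1]."""
    alpha = (1 - lam) / lam
    beta = 1 - lam
    return alpha, beta


def kernel(phi, phi_parent, lam):
    """Normalized kernel N*exp(-(phi-phi')**2/2 - alpha*phi**2/2 + beta*phi'**2/2)."""
    alpha, beta = kernel_parameters(lam)
    norm = math.sqrt((1 + alpha) / (2 * math.pi))
    exponent = (-0.5 * (phi - phi_parent) ** 2
                - 0.5 * alpha * phi ** 2
                + 0.5 * beta * phi_parent ** 2)
    return norm * math.exp(exponent)


def total_probability(phi_parent, lam, half_width=12.0, n=4000):
    """Midpoint-rule estimate of the integral of the kernel over the child field."""
    h = 2 * half_width / n
    return h * sum(kernel(-half_width + (i + 0.5) * h, phi_parent, lam)
                   for i in range(n))
```

For a mass-squared of 0.3 the slower-growing branch has falloff 0.4, which gives a child weighting of 1.5 and a parent weighting of 0.6; their combination reproduces the mass-squared, and the kernel integrates to one for any parent value.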

We conclude that the properly normalized Markov kernel is 

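$$G(\phi, \phi') = \frac{1}{\sqrt{2\pi\lambda_\pm}} \exp\left\{ -\frac{\phi^2}{2\lambda_\pm} + \phi\phi' - \frac{\lambda_\pm}{2}\phi'^2 \right\} = \frac{1}{\sqrt{2\pi\lambda_\pm}} \exp\left\{ -\frac{(\phi - \lambda_\pm\phi')^2}{2\lambda_\pm} \right\}. \tag{17}$$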
Our lesson from the previous section was that, to look for memory, we should compute
the second eigenvalue of G. While G isn't a symmetric matrix, it does satisfy detailed
balance, i.e. $G = Z S Z^{-1}$, where Z is diagonal and S is symmetric, so we are guaranteed
to have eigenvectors and real eigenvalues. These are

$$\int G(\phi, \phi')\, f_n(\phi')\, d\phi' = \lambda_\pm^n f_n(\phi), \qquad f_n(\phi) = e^{-a\phi^2} H_n(\sqrt{a}\,\phi), \qquad a = \frac{\alpha}{2} + \frac{\beta}{2} = \frac{1 - \lambda_\pm^2}{2\lambda_\pm}. \tag{18}$$

Here, $H_n$ is the (physics convention) Hermite polynomial. One can easily check the above
using the representation $H_n(x) = \frac{n!}{2\pi i}\oint \frac{dt}{t^{n+1}}\, e^{-t^2 + 2xt}$ and using Gaussian integration under
the contour integral.
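
The spectrum can also be checked by brute force. The following sketch assumes the normalized kernel is a Gaussian in the child field centered at the falloff times the parent field, with variance equal to the falloff, and that the candidate eigenfunctions are a Gaussian times a physicists' Hermite polynomial with width parameter (1 - falloff squared)/(2 falloff). It applies the kernel by a midpoint-rule integral; grid parameters and function names are hypothetical, and the check only makes sense for a falloff strictly between 0 and 1.

```python
import math


def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) from H_{k+1} = 2x*H_k - 2k*H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def eigenfunction(n, phi, lam):
    """Candidate eigenfunction exp(-a*phi**2) * H_n(sqrt(a)*phi)."""
    a = (1 - lam ** 2) / (2 * lam)
    return math.exp(-a * phi ** 2) * hermite(n, math.sqrt(a) * phi)


def apply_kernel(n, phi, lam, half_width=15.0, steps=6000):
    """Midpoint-rule estimate of the kernel acting on the n-th candidate eigenfunction."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        phi_p = -half_width + (i + 0.5) * h
        g = math.exp(-(phi - lam * phi_p) ** 2 / (2 * lam)) / math.sqrt(2 * math.pi * lam)
        total += g * eigenfunction(n, phi_p, lam)
    return total * h
```

Comparing `apply_kernel(n, phi, lam)` with `lam ** n * eigenfunction(n, phi, lam)` for a few values of n and phi shows agreement to numerical precision, so the second eigenvalue (n = 1) is the falloff itself.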

Now that we know the spectrum of G, there are a few points to be made. First,
we see that the second eigenvalue, for either choice of boundary conditions, is equal to
the corresponding falloff $\lambda_\pm$ from the tree equation of motion. This justifies our analogy
to standard and alternate quantization in EAdS: the decay of correlation functions is
controlled by the second eigenvalue, which takes the "fast-falloff" value for one branch of
$m_\partial$, and the "slow-falloff" value for the other. Second, if $m^2$ is positive and we pick the
"+" branch, the eigenvalues of G are not bounded. This reflects the familiar fact that
alternate quantization doesn't make sense for positive mass-squared. Here, we see it as
a breakdown of the normalizability of the Markov matrix. Finally, and most important
for our purposes, the "+" branch always has a second eigenvalue greater than or equal to
the critical value $1/\sqrt{2}$ (see footnote 3), while the "-" branch always has an eigenvalue smaller than or
equal to that value. To the extent that this is a faithful model of EAdS, we expect that
with alternate quantization, simple spatial integrals of free fields near the boundary are 
sensitive to conditions near the center, but that with standard quantization they forget. 
We'll see this explicitly below. 

2.3 Free fields in EAdS 

The regular tree graph studied up to now can be considered a discrete version of hyperbolic
space. In this section, we will test the memory of free fields in continuous hyperbolic space,
also known as Euclidean Anti-de Sitter. We'll define memory by analogy to section 2.1.
Specifically, if the mutual information between the field variable at some fixed radius $\ell'$
and the collection of field variables near the boundary at radius $\ell$ stays bounded away from
zero as $\ell$ approaches the boundary, then we will say that there is memory. Otherwise, not.

3 Remember, we have specialized to p = 2 in section 2.2.

We will work with free fields, for which the statistics of different modes decouple. This 
makes it possible to look for memory mode-by-mode, asking whether the probability distribution
for a given spatial Fourier component, near the boundary, is sensitive to conditions
imposed on the same mode deep in the bulk of the space. For simplicity, we'll work with
the zero mode in the x directions (see footnote 4). Whether or not memory exists depends on the choice
of boundary conditions. For the "standard quantization" boundary conditions, we will 
find that memory doesn't exist for any allowed value of the mass, while for "alternate 
quantization," we'll find that it does. 

Let us work in Poincaré slicing, with metric

$$ds^2 = \frac{dz^2 + dx^i dx^i}{z^2}, \qquad i = 1, \ldots, d, \tag{19}$$

where the anti-de Sitter radius is taken to one, and $0 < z < \infty$. The action for a massive
scalar field $\phi(x, z)$ is

$$S = \frac{1}{2} \int \frac{d^d x\, dz}{z^{d+1}} \left\{ z^2 (\partial_z \phi)^2 + z^2 (\partial_i \phi)^2 + m^2 \phi^2 \right\}. \tag{20}$$

We will Fourier transform this system in x space, and specialize to the k = 0 component,
which we'll refer to as $\varphi(z)$. The wave equation for this zero mode,

$$\partial_z^2 \varphi - \frac{d-1}{z}\, \partial_z \varphi - \frac{m^2}{z^2}\, \varphi = 0, \tag{21}$$

has two independent power-law solutions, $z^{\Delta_-}$ and $z^{\Delta_+}$, where

$$\Delta_\pm = \frac{1}{2}\left( d \pm \sqrt{d^2 + 4m^2} \right). \tag{22}$$

In the standard boundary conditions, we define the path integral by forcing the field to
zero as $\varphi \sim z^{\Delta_+}$ near the boundary, while in alternate quantization, we allow the slower
falloff $\varphi \sim z^{\Delta_-}$, but require that there be no component going like $z^{\Delta_+}$ for small z [12].

4 In section 2.3.3 we'll consider a general Fourier mode.

To look for memory, we will impose a condition deep in the bulk, $\varphi(z = \ell') = \varphi_{\ell'}$, and
then compute the conditional probability for the zero mode at radius $\ell$, $P(\varphi_\ell | \varphi_{\ell'})$. The
un-normalized conditional probability is computed by a path integral, with standard or
alternate conditions at $z \to 0$, and Dirichlet conditions $\varphi(\ell') = \varphi_{\ell'}$ and $\varphi(\ell) = \varphi_\ell$. The
normalization is adjusted as a function of $\varphi_{\ell'}$. In computing these path integrals, we will
mimic the approach of [IB], splitting the computation into a "UV" piece $0 < z < \ell$, and a
"visible" piece, $\ell < z < \ell'$. Up to normalization, the conditional probability is

$$P(\varphi_\ell | \varphi_{\ell'}) = \Psi_{UV}(\varphi_\ell)\, \Psi_{VI}(\varphi_\ell, \varphi_{\ell'}). \tag{23}$$

In this expression, $\Psi_{UV}$ depends on the standard/alternate boundary conditions, and we'll
handle the two cases separately below. However, $\Psi_{VI}$ is independent of the boundary
conditions. It is defined as the path integral over field configurations on $\ell < z < \ell'$, with
$\varphi(\ell) = \varphi_\ell$ and $\varphi(\ell') = \varphi_{\ell'}$. We can do this Gaussian path integral by evaluating the action
on the appropriate classical solution,

$$\Psi_{VI}(\varphi_\ell, \varphi_{\ell'}) \sim \exp\left( -S_{cl} \right). \tag{24}$$

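Using the equations of motion and integrating by parts, the action reduces to two surface
terms. The exact result for the zero mode is

$$\Psi_{VI}(\varphi_\ell, \varphi_{\ell'}) \sim \exp\left\{ \frac{d-\Delta}{2}\left( \frac{\varphi_\ell^2}{\ell^d} - \frac{\varphi_{\ell'}^2}{\ell'^d} \right) + \frac{2\Delta - d}{2}\, \frac{\left( \ell^{\Delta - d/2}\, \ell'^{d/2}\, \varphi_\ell - \ell^{d/2}\, \ell'^{\Delta - d/2}\, \varphi_{\ell'} \right)^2}{\ell^{2\Delta} \ell'^d - \ell^d \ell'^{2\Delta}} \right\}. \tag{25}$$

Here and in much of what follows, we will use the AdS convention $\Delta = \Delta_+$ and $\Delta_- = d - \Delta$.
The $\varphi_{\ell'}^2$ terms can be discarded because they are independent of $\varphi_\ell$ and contribute only
to the normalization of $P(\varphi_\ell | \varphi_{\ell'})$. Keeping only the first nontrivial term in small $\ell/\ell'$, we
find

$$\Psi_{VI}(\varphi_\ell, \varphi_{\ell'}) \sim \exp\left\{ \left( \frac{d - \Delta}{2\ell^d} \right) \varphi_\ell^2 + \left( \frac{2\Delta - d}{\ell^d} \right) \left( \frac{\ell}{\ell'} \right)^{\Delta} \varphi_\ell \varphi_{\ell'} \right\}. \tag{26}$$
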

If we had considered a general k mode, instead of the zero mode, the expression would be 
similar, but would involve complicated combinations of Bessel functions. We'll have more 
to say about this below. 
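
For readers who want the intermediate step, the reduction of the zero-mode action to surface terms can be sketched as follows. This is a reconstruction from the action and wave equation above (per unit volume in the x directions), not a formula quoted from the text.

```latex
\begin{align}
S &= \frac{1}{2}\int_{\ell}^{\ell'} \frac{dz}{z^{d+1}}
     \left\{ z^2 (\partial_z\varphi)^2 + m^2\varphi^2 \right\} \\
  &= \frac{1}{2}\Big[ z^{1-d}\,\varphi\,\partial_z\varphi \Big]_{\ell}^{\ell'}
   - \frac{1}{2}\int_{\ell}^{\ell'} dz\; z^{1-d}\,\varphi
     \left( \partial_z^2\varphi - \frac{d-1}{z}\,\partial_z\varphi
            - \frac{m^2}{z^2}\,\varphi \right) \\
  &= \frac{1}{2}\Big[ z^{1-d}\,\varphi\,\partial_z\varphi \Big]_{\ell}^{\ell'}
     \qquad \text{(on a solution of the wave equation)}.
\end{align}
```

Inserting the general solution $\varphi = A z^{\Delta} + B z^{d-\Delta}$, with A and B fixed by the two Dirichlet conditions, then gives a quadratic form in the two boundary values.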

2.3.1 Standard quantization 

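To compute $\Psi_{UV}$, let us first consider standard boundary conditions, where $\varphi(z) \to 0$ as
$z^{\Delta}$. The UV wave function is defined as a path integral over field configurations respecting
this condition at the boundary, and matching $\varphi(\ell) = \varphi_\ell$. The result is [TT]

$$\Psi_{UV}(\varphi_\ell) \sim \exp\left\{ -\frac{\Delta}{2\ell^d}\, \varphi_\ell^2 \right\}. \tag{27}$$

Taking the product of wave functions as in eq. (23), we get the conditional distribution,
up to normalization, as

$$P(\varphi_\ell | \varphi_{\ell'}) \sim \exp\left\{ -\left( \frac{2\Delta - d}{2\ell^d} \right) \left( \varphi_\ell - \left( \frac{\ell}{\ell'} \right)^{\Delta} \varphi_{\ell'} \right)^2 \right\}. \tag{28}$$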
To assess whether this distribution has memory or not, we will fix $\ell'$ and let $\ell$ tend to
zero, asking whether significant correlation remains between the variables $\varphi_\ell$ and $\varphi_{\ell'}$. It
is clear from the distribution that the width for $\varphi_\ell$ is proportional to $\ell^{d/2}$, and the shift
in the direction of $\varphi_{\ell'}$ scales as $\ell^{\Delta}$. Since $\Delta$ is always greater than or equal to d/2, the
width becomes large compared to the shift, so the variables lose correlation as $\ell$ tends to
zero: there is no memory. This is very analogous to the Ising model above the memory
transition temperature, where the variance in the magnetization grows large compared to 
the bias due to a particular initial condition. 

2.3.2 Alternate quantization 

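Next, we consider the alternate boundary conditions, $\varphi(z) \sim z^{d-\Delta}$. This changes the path
integral that defines $\Psi_{UV}$, and we find

$$\Psi_{UV}(\varphi_\ell) \sim \exp\left\{ -\left( \frac{d - \Delta}{2\ell^d} \right) \varphi_\ell^2 \right\}. \tag{29}$$

With this wave function, we see from (26) that the $\varphi_\ell^2$ term will cancel in the product
$\Psi_{UV}\Psi_{VI}$. Thus, we need to go back to the $\Psi_{VI}$ computation and keep higher order
terms in $\ell/\ell'$. We find

$$\Psi_{VI} \sim \exp\left\{ \left( \frac{d - \Delta}{2\ell^d} \right) \varphi_\ell^2 - \left( \frac{2\Delta - d}{2\ell^d} \right) \left( \frac{\ell}{\ell'} \right)^{2\Delta - d} \varphi_\ell^2 + \left( \frac{2\Delta - d}{\ell^d} \right) \left( \frac{\ell}{\ell'} \right)^{\Delta} \varphi_\ell \varphi_{\ell'} \right\}. \tag{30}$$

We now take the product with $\Psi_{UV}$, and complete the square, freely adding a term proportional
to $\varphi_{\ell'}^2$. The result is

$$P(\varphi_\ell | \varphi_{\ell'}) \sim \exp\left\{ -\frac{2\Delta - d}{2\, \ell^{2(d-\Delta)}\, \ell'^{2\Delta - d}} \left( \varphi_\ell - \left( \frac{\ell}{\ell'} \right)^{d - \Delta} \varphi_{\ell'} \right)^2 \right\}. \tag{31}$$

This time, the shift towards $\varphi_{\ell'}$ and the width compete: both are order $\ell^{d-\Delta}$. It follows
that the variables maintain an order one correlation, even in the limit $\ell \to 0$. In alternate
quantization, we conclude that EAdS has memory (see footnote 5).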

As expected, these results mirror those of the "fast-falloff" and "slow-falloff" branches
on the Gaussian tree of section 2.2, and the second-eigenvalue condition discussed in the
Ising tree of section 2.1. Here, standard quantization implies a falloff with characteristic
dimension greater than d/2, and doesn't permit memory. Alternate quantization implies
a falloff slower than d/2, and maintains memory.
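
The competition between shift and width can be tabulated numerically. The sketch below rests on assumptions spelled out here: the conditional distribution is taken to be Gaussian in both cases, with mean (ell/ell')^Delta times the bulk value and variance ell^d/(2 Delta - d) for standard quantization, and with mean (ell/ell')^(d - Delta) times the bulk value and variance ell^(2(d - Delta)) ell'^(2 Delta - d)/(2 Delta - d) for alternate quantization. These forms come from multiplying the UV and visible wave functions and completing the square; the function names and sample values are hypothetical.

```python
import math


def standard_shift_over_width(ell, ell_prime, delta, d, phi_bulk=1.0):
    """Shift/width ratio for the assumed standard-quantization Gaussian (delta > d/2)."""
    shift = (ell / ell_prime) ** delta * phi_bulk
    width = math.sqrt(ell ** d / (2 * delta - d))
    return shift / width


def alternate_shift_over_width(ell, ell_prime, delta, d, phi_bulk=1.0):
    """Shift/width ratio for the assumed alternate-quantization Gaussian (delta > d/2)."""
    shift = (ell / ell_prime) ** (d - delta) * phi_bulk
    variance = ell ** (2 * (d - delta)) * ell_prime ** (2 * delta - d) / (2 * delta - d)
    return shift / math.sqrt(variance)
```

As the cutoff radius shrinks at fixed bulk radius, the first ratio falls like ell to the power Delta - d/2, while the second stays constant, which is the zero-mode version of the statement that only alternate quantization keeps an order one correlation.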



5 One might wonder what happens at the point where $\Delta = d/2$. There, one can check that both
"standard" and "alternate" conditions are forgetful, since the width is proportional to $\ell^{d/2}$, and the shift
is proportional to $\ell^{d/2}/\log \ell$.
2.3.3 Mutual information

So far, we have treated only the spatial zero mode and been somewhat binary in the
distinction between memory and forgetfulness. In this section, we will be a little more
quantitative. We'll consider a general Fourier component $\varphi_k(z)$, and compute the mutual
information $I(\varphi_k(\ell); \varphi_k(\ell'))$ between the mode at some fixed radius $\ell'$ in the bulk of EAdS,
and the same mode, evaluated at a radius $\ell$ that approaches the boundary.

The mutual information of two random variables is defined as the sum of the differential
entropies of the marginal distributions, minus the entropy of the joint distribution:

$$I(X; Y) = S(X) + S(Y) - S(X, Y). \tag{32}$$

Here, the differential entropy S is defined as the integral of $-p(x) \log p(x)$, and can be
positive or negative. To compute the relevant entropies, we need the marginal distributions
for $\varphi_k(\ell)$ and $\varphi_k(\ell')$, as well as the joint distribution for both. To compute the marginal
distribution, we divide the bulk path integral up into IR and UV pieces, and take

$$P(\varphi_k(\ell)) = \mathcal{N}\, \Psi_{UV}(\varphi_k(\ell))\, \Psi_{IR}(\varphi_k(\ell)). \tag{33}$$

Here, $\mathcal{N}$ is a normalization constant, and the IR wave function is a path integral over
field configurations that are smooth in the interior of the space, and match onto $\varphi_k(\ell)$ at
radius $\ell$. As always, we evaluate the various path integrals by evaluating the action on the
relevant classical solutions, which are linear combinations of the Bessel functions $I_{\pm\nu}(kz)$,
where

$$\nu = \Delta - \frac{d}{2} = \frac{1}{2}\sqrt{d^2 + 4m^2}. \tag{34}$$

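A straightforward but somewhat tedious evaluation gives the marginal distribution for
alternate boundary conditions as

$$P(\varphi_k(\ell)) = \mathcal{N} \exp\left\{ -\frac{\varphi_k(\ell)^2}{2\ell^d\, I_{-\nu}(k\ell)\, K_\nu(k\ell)} \right\} \qquad \text{(alternate quant.)}, \tag{35}$$

and for standard boundary conditions as

$$P(\varphi_k(\ell)) = \mathcal{N} \exp\left\{ -\frac{\varphi_k(\ell)^2}{2\ell^d\, I_{\nu}(k\ell)\, K_\nu(k\ell)} \right\} \qquad \text{(standard quant.)}. \tag{36}$$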

The expression for the joint distribution is slightly more complicated. In addition to
the UV and IR wave functions, one has to compute a VI wave function that implements
the path integral over $\ell < z < \ell'$, and then take the product of all three,

$$P(\varphi_k(\ell), \varphi_k(\ell')) = \mathcal{N}\, \Psi_{UV}(\varphi_k(\ell))\, \Psi_{VI}(\varphi_k(\ell), \varphi_k(\ell'))\, \Psi_{IR}(\varphi_k(\ell')). \tag{37}$$
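This joint distribution is Gaussian, but the covariance matrix is an unpleasant combination
of Bessel functions. The mutual information, however, simplifies rather nicely. For the
alternate boundary conditions, we find

$$I(\varphi_k(\ell); \varphi_k(\ell')) = -\frac{1}{2} \log\left( 1 - \frac{I_{-\nu}(k\ell)\, K_\nu(k\ell')}{I_{-\nu}(k\ell')\, K_\nu(k\ell)} \right) \qquad \text{(a.q.)}, \tag{38}$$

while for the regular boundary conditions, we get

$$I(\varphi_k(\ell); \varphi_k(\ell')) = -\frac{1}{2} \log\left( 1 - \frac{I_{\nu}(k\ell)\, K_\nu(k\ell')}{I_{\nu}(k\ell')\, K_\nu(k\ell)} \right) \qquad \text{(s.q.)}. \tag{39}$$

In the limit of small $\ell$, the mutual information for regular boundary conditions tends to
zero as $(k\ell)^{2\nu}$, as shown in the right panel of figure 3. On the other hand, with alternate
boundary conditions, we find a nonzero limit

$$I(\varphi_k(0); \varphi_k(\ell')) = \frac{1}{2} \log\left( \frac{I_{-\nu}(k\ell')}{I_{\nu}(k\ell')} \right) \qquad \text{(a.q.)}. \tag{40}$$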

This mutual information is plotted as a function of $\nu$ in the left panel of figure 3.

Figure 3: (Left) the mutual information $I(\varphi_k(\ell); \varphi_k(\ell'))$ for alternate quantization, considered as a function of $\nu$. In this plot, $\ell$ has been taken to zero, while $k\ell' = 0.1$ in the top line, and 0.5 in the bottom line. (Right) the mutual information for both boundary conditions, with $\nu$ fixed at 0.5 and $k\ell'$ fixed at 0.1, plotted as a function of $k\ell$. The upper line is alternate, and the lower is standard.

There are two interesting points as a function of $\nu$. First, at $\nu = 0$, which is equivalent to $\Delta = d/2$, the mutual information is exactly zero, consistent with the zero-mode analysis above. Second, the mutual information also vanishes at the lower end of the alternate quantization window, $\nu = 1$, which corresponds to the unitarity bound $\Delta_- = (d - 2)/2$.
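The behavior described above can be checked numerically. The sketch below is illustrative only: it assumes that, for jointly Gaussian variables, the mutual information is $-\frac{1}{2}\log(1-\rho^2)$ with $\rho$ the correlation coefficient, and that $\rho^2$ is the ratio of modified Bessel functions $I_{\pm\nu}(k\ell)K_\nu(k\ell') / (K_\nu(k\ell) I_{\pm\nu}(k\ell'))$ for $\ell < \ell'$. The helper names (`bessel_i`, `bessel_k`, `mutual_information`, `alternate_limit`) are our own, and the series-based Bessel routines are only meant for small arguments and non-integer $\nu$.

```python
import math


def bessel_i(nu, x, terms=60):
    """Modified Bessel function I_nu(x) from its power series (nu > -1, x > 0)."""
    return sum(
        (x / 2.0) ** (2 * m + nu) / (math.factorial(m) * math.gamma(m + nu + 1))
        for m in range(terms)
    )


def bessel_k(nu, x):
    """K_nu(x) for non-integer nu, via K = (pi/2) (I_{-nu} - I_nu) / sin(nu pi)."""
    return 0.5 * math.pi * (bessel_i(-nu, x) - bessel_i(nu, x)) / math.sin(nu * math.pi)


def mutual_information(nu, k_ell, k_ell_prime, alternate):
    """Gaussian mutual information between the mode at radii ell < ell'.

    Assumed form: rho^2 = I(k ell) K(k ell') / (K(k ell) I(k ell')), where I has
    order -nu for alternate and +nu for standard quantization.
    """
    order = -nu if alternate else nu
    rho_sq = (bessel_i(order, k_ell) * bessel_k(nu, k_ell_prime)) / (
        bessel_k(nu, k_ell) * bessel_i(order, k_ell_prime)
    )
    return -0.5 * math.log(1.0 - rho_sq)


def alternate_limit(nu, k_ell_prime):
    """ell -> 0 limit for alternate quantization: 1/2 log(I_{-nu} / I_nu)."""
    return 0.5 * math.log(bessel_i(-nu, k_ell_prime) / bessel_i(nu, k_ell_prime))


if __name__ == "__main__":
    for k_ell in (1e-1 / 2, 1e-2, 1e-3, 1e-4):
        print(
            k_ell,
            mutual_information(0.5, k_ell, 0.1, alternate=True),
            mutual_information(0.5, k_ell, 0.1, alternate=False),
        )
```

With $\nu = 0.5$ the Bessel functions reduce to hyperbolic functions, so the alternate-quantization limit can be compared against $\frac{1}{2}\log\coth(k\ell')$ as a sanity check.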
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2.4 Free fields in dS 



In de Sitter space, we will set up the memory question in a manner that is now familiar, by studying the zero mode of a massive scalar, and perturbing the initial condition away from Bunch-Davies. Then we will ask whether the late-time statistics of the zero mode are correlated with the choice of initial condition.

Of course, a free field in de Sitter is somewhat different from the EAdS and tree 
models we studied so far. It is a Lorentzian quantum mechanical system, subject to the 
constraint of unitarity. For free theories, this constraint is particularly strong, since the 
different modes decouple, and the evolution of each mode must be unitary. This means 
that information about the initial conditions for the zero mode can't get scrambled into 
phase correlations between various modes. Indeed, we'll see that Fourier components of 
free fields in de Sitter maintain memory of initial conditions for any allowed mass. Cosmic 
no-hair applies to local quantities in position space, but not to global quantities like Fourier 
modes. 

The metric for the flat slicing of de Sitter space is

$ds^2 = \frac{-d\eta^2 + dx^i dx^i}{\eta^2}, \qquad i = 1, \ldots, d$, (41)

where $\eta$ plays the role of conformal time and the future boundary is at $\eta = 0$. As in EAdS, the wave equation has two branches, characterized by different falloff behaviors at late time, $\sim \eta^{\Delta_+}$ and $\sim \eta^{\Delta_-}$, where the de Sitter dimensions are

$\Delta_\pm = \frac{1}{2}\left(d \pm \sqrt{d^2 - 4m^2}\right)$. (42)

The quantum state of the field at conformal time $\eta$ is described by a wave function $\Psi[\varphi(x); \eta]$. We will specialize to the spatial zero mode, denoted $\varphi(\eta)$, and the corresponding wave function $\Psi(\varphi, \eta)$. Expectation values of functions of $\varphi$ are computed using the weighting $|\Psi|^2$. Normally, in de Sitter space, one uses the Bunch-Davies ground state wave function, which is computed in appendix A. To test for memory, we'll change the initial condition by constraining the initial value of the zero mode $\varphi$ at some early time $\eta'$.

Specifically, we'll take a wave packet at initial time $\eta'$, and compute the wave function at late time $\eta$ using the Gaussian propagator,

$\Psi(\varphi_\eta, \eta) = \int d\varphi_{\eta'}\, K(\varphi_\eta, \varphi_{\eta'}; \eta, \eta')\, \exp\{-A(\varphi_{\eta'} - \bar\varphi_{\eta'})^2\}$. (43)

The propagator $K$ can be computed by evaluating the action on the appropriate classical solution. Alternatively, it is simply the analytic continuation of the intermediate wave function $\Psi_M$ from the previous subsection. Up to normalization, $K(\varphi_\eta, \varphi_{\eta'}; \eta, \eta')$ has the leading behavior



-A_^ 2 ,/2A - d\ /r/\2A+-d 2 ./2A+ - d\ / 77 \ A + .A + 2 



"{'M^'IVJ^ * + < F?^J W-'^}. (44) 

First, let's take the almost massless case where $\Delta_-$ is real and less than $d/2$. Then, doing the Gaussian integrals and taking the modulus-squared, we find the probability distribution

P(<p v ) ~exp{ - (<p v - r) A - p<p v r) 2 } (45) 



4A 2 + A^ V + ' ' 2A+-d' 
Here, to review the notation, $\varphi_\eta$ is the configuration at late time $\eta \to 0$, and $\bar\varphi_{\eta'}$ is the center of the Gaussian wave packet that defines the initial conditions at early time $\eta'$. The key point is that the width of the distribution for $\varphi_\eta$ and the shift in the direction of $\bar\varphi_{\eta'}$ are both of order $\sim \eta^{\Delta_-}$. This means that $\varphi_\eta$ remains significantly biased in the direction of the initial conditions, so the system has memory.

The cases of $\Delta_\pm = d/2$ and complex $\Delta_\pm$ need to be treated separately. First, for complex $\Delta_\pm$, the real part is always $d/2$. A straightforward calculation similar to the above demonstrates that the shift oscillates with scale, but the decay of its envelope matches the decay of the width, and both are proportional to $\eta^{d/2}$. For $\Delta_\pm = d/2$, the shift and width are also matched, proportional to $\sim \eta^{d/2}\log\eta$. We see that a free massive scalar in dS has memory for all allowable values of $\Delta_\pm$.
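As a small numerical illustration of the three mass ranges, the sketch below evaluates the de Sitter dimensions of equation (42) in units where the de Sitter radius is one. The function names are hypothetical; the only point is that the real part of $\Delta_-$, which controls the common decay of shift and width, never exceeds $d/2$.

```python
import cmath


def ds_dimensions(mass, d):
    """Return (Delta_minus, Delta_plus) = (d -/+ sqrt(d^2 - 4 m^2)) / 2.

    The values are complex once the mass exceeds d/2 (de Sitter radius set to one).
    """
    root = cmath.sqrt(d * d - 4.0 * mass * mass)
    return (d - root) / 2.0, (d + root) / 2.0


def slow_falloff_exponent(mass, d):
    """Real part of Delta_minus, the exponent of the late-time envelope eta^{Re Delta_-}."""
    return ds_dimensions(mass, d)[0].real


if __name__ == "__main__":
    d = 3
    for mass in (0.0, 0.5, 1.0, 1.5, 3.0, 10.0):
        delta_minus, delta_plus = ds_dimensions(mass, d)
        print(mass, delta_minus, delta_plus, slow_falloff_exponent(mass, d) <= d / 2)
```

For masses above $d/2$ the two dimensions become complex conjugates with real part exactly $d/2$, which is the oscillating case described above.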

Even without the unitarity argument or the explicit calculation, the results for $\Delta_- < d/2$ could have been anticipated from the analogy to the Ising model and the Kesten-Stigum lower bound. Free fields in de Sitter can be modeled by a stochastic Markov system [15], and $\Delta_- < d/2$ implies a second eigenvalue greater than the critical value.⁶

2.5 CFT 

Since we defined memory in terms of fluctuation statistics near the boundary of (EA)dS, we should be able to translate the forgetful transition into CFT terms. In this section, we will do this in two related ways. First, we'll consider the statistics of certain sums of operators in the fixed point theory and argue that bulk memory is related to a failure of the central limit theorem. Second, we'll mock up modified bulk initial conditions as an RG transient and look for memory in the UV. In both cases, we'll find a sharp change of behavior at $\Delta = d/2$.

6 Interacting theories in de Sitter can have a falloff faster than $\Delta_- = d/2$ [TS]. It seems likely that, in such theories, memory of initial conditions is not recorded in the statistics of simple late-time quantities.

Consider, then, a CFT in $d$ dimensions, and focus on a patch of volume $L^d$, with a lattice cutoff at scale $\epsilon$. Choose an operator $O_i$ with dimension $\Delta_i$, and define the variable

$M_\epsilon = \sum_x O_i(x)$. (46)

This sum contains one term per lattice point in the patch, for a total of $(L/\epsilon)^d$ summands, so as the lattice spacing becomes small, the quantity $M_\epsilon$ involves a very large number of operators. This quantity is analogous to the Ising magnetization at level $u$ in the tree, with $p^{-u} \sim \epsilon/L$. Based on the second-eigenvalue condition, we expect the existence of memory to be connected with a transition as a function of $\Delta_i$. Specifically, we expect that for $\Delta_i > d/2$, small $\epsilon$ will make the distribution $P(M_\epsilon)$ Gaussian, while for $\Delta_i < d/2$ the distribution will remain non-Gaussian in the limit of small $\epsilon$.

To confirm this, we will inspect some moments of the distribution for $M_\epsilon$. The two point function is

$\langle M_\epsilon^2\rangle = \sum_{x,y}\langle O_i(x) O_i(y)\rangle$. (47)

We can use translation invariance (ignoring a small correction due to finite volume) to do one of the sums,

$\langle M_\epsilon^2\rangle \approx \left(\frac{L}{\epsilon}\right)^d \sum_x \langle O_i(x) O_i(0)\rangle$. (48)

First, suppose $\Delta_i > d/2$. Then, the remaining sum is dominated by the UV, and the leading contribution is just the two point function at lattice scale. We normalize the operators so this is one, making the overall answer proportional to $(L/\epsilon)^d$.

Next, suppose $\Delta_i < d/2$. Then the sum is dominated by the IR, so it can be approximated as an integral

$\int^{L} \frac{d^dx}{\epsilon^d}\,\frac{\epsilon^{2\Delta_i}}{x^{2\Delta_i}} \sim \frac{L^{d-2\Delta_i}}{\epsilon^{d-2\Delta_i}}$.

This means the two point function is proportional to $(L/\epsilon)^{2d-2\Delta_i}$. To get a nicely normalized quantity in the continuum limit, we should divide $M_\epsilon$ not by the square root of the number of points, but by a fractional power, $M_\epsilon\,(\epsilon/L)^{d-\Delta_i}$.
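The two regimes of the remaining sum in (48) are easy to see in a toy setting. The following sketch assumes $d = 1$, lattice spacing $\epsilon = 1$, and a pure power-law two point function $|x|^{-2\Delta_i}$ normalized to one at coincident lattice points; the function names are ours and the fitted exponent is only an estimate at finite size.

```python
import math


def two_point_sum(delta, n_sites):
    """Sum over a 1d patch of <O(x)O(0)> = |x|^(-2 delta), with the x = 0 term set to 1.

    n_sites plays the role of L/epsilon (sites on each side of the origin).
    """
    return 1.0 + 2.0 * sum(x ** (-2.0 * delta) for x in range(1, n_sites + 1))


def growth_exponent(delta, n_small=1000, n_large=100000):
    """Estimate the power of L/epsilon with which the sum grows."""
    ratio = two_point_sum(delta, n_large) / two_point_sum(delta, n_small)
    return math.log(ratio) / math.log(n_large / n_small)


if __name__ == "__main__":
    for delta in (0.1, 0.25, 0.4, 0.75, 1.0):
        # Expected: about d - 2*delta = 1 - 2*delta below d/2, and about 0 above d/2.
        print(delta, growth_exponent(delta))
```

Below $\Delta_i = d/2$ the estimated exponent approaches $d - 2\Delta_i$, so the full two point function scales as $(L/\epsilon)^{2d - 2\Delta_i}$; above it the sum saturates and only the factor $(L/\epsilon)^d$ remains.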
To go further, consider the fourth moment,

$\langle M_\epsilon^4\rangle = \sum_{x,y,z,w}\langle O_i(x) O_i(y) O_i(z) O_i(w)\rangle$. (49)

This is a four-point function, which depends on the entire operator spectrum of the theory. However, for $\Delta_i > d/2$, it is dominated by UV singularities when two pairs of the operators approach each other. There are three ways the operators can pair up, so, for large $L/\epsilon$,

$\langle M_\epsilon^4\rangle = 3\,\langle M_\epsilon^2\rangle\langle M_\epsilon^2\rangle$. (50)

This equation implies that the fourth order cumulant is zero, consistent with Gaussian statistics. Suppose, instead, that $\Delta_i < d/2$. Then there are no coincident-point divergences in the sum. It follows from scaling that the result has to be proportional to $(L/\epsilon)^{4d-4\Delta_i}$. The coefficient depends on the OPE constants and spectrum of the theory, so, in general, the fourth-order cumulant will be nonzero.

Perhaps a more direct way to understand memory in a CFT is to study an RG transient, rather than focusing on the fixed point statistics. A related issue was considered in [20], and we will use a similar construction. Specifically, we will mock up the bulk transient by adding an operator $O_i$ to the action, far from our patch of size $L^d$.⁷ Let us arrange the coefficient so that the one point function of $O_i$ at the center of the patch, renormalized at scale $L$, is order one. Within the patch, this means that the one point function of the operator $O_i(x)$ at the lattice scale is order $(\epsilon/L)^{\Delta_i}$, so the operator $M_\epsilon$ will have a one-point function of order $(\epsilon/L)^{\Delta_i - d}$. Comparing this to the variance of the distribution for $M_\epsilon$ in the unperturbed CFT, we find that the statistics of $M_\epsilon$ can detect the perturbation if $\Delta_i < d/2$, but not if $\Delta_i > d/2$.
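The comparison in the last sentence amounts to a signal-to-noise estimate. A minimal sketch, dropping all order-one coefficients and assuming the scalings quoted above (one-point function $(L/\epsilon)^{d-\Delta_i}$, standard deviation $(L/\epsilon)^{d/2}$ for $\Delta_i > d/2$ and $(L/\epsilon)^{d-\Delta_i}$ for $\Delta_i < d/2$); the function name is hypothetical and the marginal case $\Delta_i = d/2$, which involves logarithms, is not treated:

```python
def signal_to_noise(delta, d, ratio):
    """Order-of-magnitude ratio of the shift of M_eps to its standard deviation.

    ratio is L/epsilon. All order-one coefficients are set to one.
    """
    # (L/eps)^d lattice sites, each with one-point function (eps/L)^delta.
    signal = ratio ** (d - delta)
    if delta < d / 2.0:
        # IR-dominated variance (L/eps)^(2d - 2 delta).
        noise = ratio ** (d - delta)
    else:
        # UV-dominated variance (L/eps)^d.
        noise = ratio ** (d / 2.0)
    return signal / noise


if __name__ == "__main__":
    for delta in (0.5, 1.0, 2.0, 2.5):
        print(delta, [signal_to_noise(delta, 3, r) for r in (1e2, 1e4, 1e6)])
```

For $\Delta_i < d/2$ the ratio stays order one as $\epsilon \to 0$, while for $\Delta_i > d/2$ it decays as $(\epsilon/L)^{\Delta_i - d/2}$.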

3 Configuration-space ultrametricity

In this section, we will review the connection between memory and the existence of multiple extreme (pure) components in the Gibbs state. We will discuss the branching structure of the space of pure states for the Ising model. We will then switch to de Sitter, where Anninos and Denef argued that, similarly, the Bunch-Davies vacuum splits into a tree-like space of extreme states [13]. We'll generalize their analysis to positive mass.

7 We are not smearing the operator over the $L^d$ patch as in [20]; we are inserting the operator at a definite location far away. Had we smeared the operator, we would have found an effect that becomes large in the UV for irrelevant $O_i$, unlike the bulk transient we are trying to model. We are grateful to Stephen Shenker for a discussion of this point.

3.1 Ising model on a tree

Mathematically, memory in the Ising model is connected to the existence of nontrivial variables "at infinity" in the tree. As an example, we can consider the appropriately normalized total magnetization $M_u$ at generation $u$. For $\lambda > 1/\sqrt{p}$, the limit theorems of [8] (see [7] or [21] for explanation) establish that the variables

$\frac{M_u}{(p\lambda)^u}$ (51)

converge to a random variable $M_\infty$ which is correlated with the spin at the root of the tree. Let us contrast this with the case in the no-memory phase $\lambda < 1/\sqrt{p}$. There, the marginal distributions for the variables

$\frac{M_u}{p^{u/2}}$ (52)

converge to a Gaussian with mean zero, but the variables themselves do not converge: in a given realization of the spin system, the variable $M_u$ would continue switching sign as a function of $u$.
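The two phases can be illustrated with an exact second-moment recursion. The sketch below assumes the Markov (broadcast) form of the free-boundary Ising model on the $p$-ary tree, in which every child spin has correlation $\lambda$ with its parent; under that assumption $\langle \sigma_{\rm root} M_u\rangle = (p\lambda)^u$ and $\langle M_u^2\rangle = p\,\langle M_{u-1}^2\rangle + p(p-1)\lambda^2 (p\lambda)^{2(u-1)}$. The function name is ours.

```python
import math


def root_leaf_correlation(p, lam, depth):
    """Correlation between the root spin and the total magnetization M_u at u = depth.

    Assumes a broadcast Ising model on the p-ary tree with parent-child
    correlation lam (second eigenvalue of the transfer matrix).
    """
    second_moment = 1.0  # <M_0^2> for the single root spin
    for u in range(1, depth + 1):
        # p independent-looking subtrees plus cross terms between sibling subtrees.
        second_moment = p * second_moment + p * (p - 1) * lam ** 2 * (p * lam) ** (2 * (u - 1))
    return (p * lam) ** depth / math.sqrt(second_moment)


if __name__ == "__main__":
    p = 2
    for lam in (0.5, 0.65, 1.0 / math.sqrt(2.0), 0.8, 0.9):
        print(lam, [round(root_leaf_correlation(p, lam, u), 6) for u in (5, 20, 80)])
```

For $p\lambda^2 > 1$ the correlation tends to a nonzero constant as the depth grows, while for $p\lambda^2 < 1$ it decays like $(\lambda\sqrt{p})^u$, in line with the threshold $\lambda = 1/\sqrt{p}$ quoted above.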

In probability theory, the concept of variables at infinity is formalized as the tail field, which is the set of observables in an infinite system that don't depend on the variables at any finite number of sites. A system is described as having a trivial tail if every tail event has probability either zero or one. The existence, in the memory phase, of correlation between the variable $M_\infty$ and the root implies that the distribution for $M_\infty$ has finite width, so the tail field is nontrivial. This is equivalent (see [22], Theorem 7.7) to the statement that the free boundary Gibbs state is not an extreme point in the convex space of Gibbs states.⁸

We would like to characterize the space of the extreme Gibbs states. For the Ising model, we can get a fairly complete picture of this space as follows. First, we divide up the ensemble according to whether the variable $M_\infty$ is positive or negative; in other words, we divide it up according to whether the reconstruction of the initial spin is up or down. This cleanly divides the set of extreme components that makes up the Gibbs state into two components. We can repeat this procedure for tail variables corresponding to the total magnetization associated to the leaves of the subtrees emanating from the $p$ children of the root. Focus on the $p = 2$ case for simplicity. The root has two children at level $u = 1$, which we'll label (1,1) and (1,2). Let $M_\infty^{(1,1)}$ and $M_\infty^{(1,2)}$ denote the rescaled magnetization at infinity for the subtrees associated to these vertices. If $M_\infty$ is positive, then there are three possibilities: both $M_\infty^{(1,1)}$ and $M_\infty^{(1,2)}$ might be positive, or they could have opposite sign. Similarly, if $M_\infty$ is negative, then both might be negative or they might have opposite signs. Proceeding in this way, we can further divide the Gibbs state according to the $2^{p^u}$ possible signs of the $p^u$ variables $M_\infty^{(u,i)}$, $i = 1, \ldots, p^u$, where $M_\infty^{(u,i)}$ is the total magnetization at infinity in the subtrees growing out of the $i$-th vertex at generation $u$.

8 Extreme (pure) states and their connection with spin glasses were recently reviewed in

As a function of $u$, this decomposition defines a branching, RG-type evolution of the space of pure states. Because of the existence of memory, the sign of the variable $M_\infty^{(u,i)}$ is correlated with the $i$-th spin at generation $u$ in the tree, so this evolution is related to the evolution of the configurations in the actual system. In the limit of small $\gamma$, this relation becomes exact, but at finite $\gamma$ it is approximate, since the reconstruction of the spin $(u, i)$ from $M_\infty^{(u,i)}$ can be wrong.

3.2 Free fields in dS 

Anninos and Denef recently suggested a very similar extreme state decomposition for the Bunch-Davies vacuum associated to a massless field in de Sitter [13]. Related overlap distributions were also computed in [24]. Part of the motivation for invoking extreme states was the failure of the massless field to cluster; the two point correlator is logarithmically divergent at large distance. By contrast, the two point function of free massive fields clusters, so one might guess that the analysis of [13] is related to a pathology of the massless scalar. This is not the case. In the remainder of this section, we will compute overlap distributions for positive mass fields in de Sitter. We'll see that the ultrametric tendency of the massless field is shared by massive fields as long as $\Delta_- < d/4$.

3.2.1 Setup 



As in section 2.4, we will work with de Sitter space in flat slicing, but we'll make the IR cutoff explicit by compactifying space on a torus of comoving size $L$, $x^i \sim x^i + L$. The probability distribution on spatial field configurations is given by the norm-squared of the wave function, $|\Psi|^2$. Following Anninos and Denef, we will define a distance on the space of these configurations,

$d(1,2) = \frac{1}{L^d}\int d^dx\,\left(\bar\phi_1(x) - \bar\phi_2(x)\right)^2$, (53)

where $\bar\phi$ is obtained from $\phi$ by subtracting the zero mode and smearing over a comoving scale corresponding to a large but fixed number of horizons. It will be convenient to define a regulated distance $\delta(1,2)$ by subtracting the mean,

$\delta(1,2) = d(1,2) - \langle d(1,2)\rangle$. (54)

Of course, this subtracted distance can be either positive or negative.

We will be interested in the following questions [13]: what is the probability, in the ensemble of fluctuations determined by $|\Psi|^2$, that two independently chosen configurations have a given distance $\delta$? Or that three configurations will have distances $\delta_1, \delta_2, \delta_3$? Rather than computing the probability distribution for $\delta(1,2)$ directly, it is easier to compute exponential moments $\langle e^{-s\delta(1,2)}\rangle$ over field configurations $\phi_1$ and $\phi_2$, as a function of $s$, and recover the distribution for $\delta$ by inverse Laplace transform

$P(\delta) = \int_{-i\infty}^{i\infty} \frac{ds}{2\pi i}\; e^{s\delta}\,\langle e^{-s\delta(1,2)}\rangle_\eta$. (55)

The expectation value $\langle\cdot\rangle_\eta$ is done with respect to the measure $|\Psi|^2$ at time $\eta$, provided by the Bunch-Davies wave function. This wave function is computed in appendix A, and the result, for superhorizon modes $k\eta \ll 1$, is

$|\Psi(\phi, \eta)|^2 = \mathcal{N}\exp\Big(-2\sum_{\mathbf{k}} \beta(k,\eta)\,|\phi_{\mathbf{k}}|^2\Big)$, (56)

where $\mathcal{N}$ is a normalization factor, independent of $\phi$, $k = |\mathbf{k}|$, and

$\beta(k,\eta) \propto \frac{L^d\,\ell_{dS}^{d-1}\,k^{d-2\Delta_-}}{\eta^{2\Delta_-}}, \qquad \Delta_- = \frac{1}{2}\left(d - \sqrt{d^2 - 4m^2\ell_{dS}^2}\right)$. (57)

The proportionality constant in the definition of $\beta$ is an order one number that depends on $m\ell_{dS}$ and $d$. It is given in the appendix but won't be needed here.

The wave function is a product over the different $\mathbf{k}$ modes, and the distance is quadratic in $\phi_1$ and $\phi_2$, so we can compute the expectation value $\langle e^{-s\delta(1,2)}\rangle$ by Gaussian integration, mode by mode. The computation is entirely parallel to the massless one detailed in [13], and the result is

$\langle e^{-s\delta(1,2)}\rangle_\eta = \prod_{\mathbf{k}}{}' \frac{e^{s/\beta(k,\eta)}}{1 + s/\beta(k,\eta)}$, (58)

where the primed product runs over unordered pairs $(\mathbf{k}, -\mathbf{k})$ with $\mathbf{k} \neq 0$. Similarly, the probability distribution for the distances between three configurations is given by a three dimensional Laplace transform of

$\langle e^{-s_1\delta(1,2) - s_2\delta(1,3) - s_3\delta(2,3)}\rangle_\eta = \prod_{\mathbf{k}}{}' \frac{e^{(s_1+s_2+s_3)/\beta(k,\eta)}}{1 + (s_1+s_2+s_3)/\beta(k,\eta) + 3(s_1s_2+s_1s_3+s_2s_3)/4\beta(k,\eta)^2}$. (59)

If the mass is zero, then the Gaussian kernel $\beta$ is independent of conformal time $\eta$ and we can take the late-time limit safely. However, in the massive case, $\Delta_- > 0$, and $\beta$ blows up at late time. This means that the logarithm of the exponential moments tends to zero for any fixed $s$, and the distance distribution collapses to a delta function.

This is a reflection of the fact that massive fields in de Sitter space have a power law fade at late time, proportional to $\eta^{\Delta_-}$. We can compensate for this by defining a new, $\eta$-dependent metric on field configurations,⁹

$d_\Delta(1,2) \propto \begin{cases} \eta^{-2\Delta_-}\int d^dx\,\left(\bar\phi_1(x) - \bar\phi_2(x)\right)^2, & \Delta_- < d/4, \\ \eta^{-d/2}\int d^dx\,\left(\bar\phi_1(x) - \bar\phi_2(x)\right)^2, & \Delta_- > d/4, \end{cases}$

$\delta_\Delta(1,2) = d_\Delta(1,2) - \langle d_\Delta(1,2)\rangle$. (60)

The reason for the change in behavior of the normalization at $\Delta_- = d/4$ will be made clear below, where we'll see explicitly that the above definition ensures that the width of the distribution for $\delta_\Delta(1,2)$ has a finite and nonzero late-time limit. In what follows, we'll adjust the constants of proportionality as a function of mass so that the variance is one.

3.2.2 Ultra-light fields 

We'll begin by considering the case $\Delta_- < d/4$, corresponding to a very small mass $(m\ell_{dS})^2 < 3d^2/16$. In this mass range, the explicit power of conformal time in the definition of $\delta_\Delta(1,2)$ cancels the time-dependence of $\beta$. Up to a constant multiple in the definition of $\delta_\Delta$, which we fix by measuring distances in units of the variance, we have

$\langle e^{-s\delta_\Delta(1,2)}\rangle = \prod_{\mathbf{n} \neq 0}{}' \frac{e^{s/n^{d-2\Delta_-}}}{1 + s/n^{d-2\Delta_-}}$, (61)

where the primed product runs over unordered pairs $(\mathbf{n}, -\mathbf{n})$, and $n = |\mathbf{n}|$. As long as $\Delta_- < d/4$, this product converges, and we are able to remove the smearing function that cuts off high momentum modes. A similar formula holds for the triple overlap.

9 For $\Delta_- = d/4$, an additional factor of $1/\log(L/\eta)$ is required in the normalization.

As far as we know, this infinite product (61) is not known in closed form. One could approximate the product as the exponential of the integral of the logarithm, which can be evaluated in terms of known functions. But, even with this simplification, it doesn't seem possible to perform the Laplace transform to recover $P(\delta)$, even in the saddle point approximation.

However, even without doing the relevant integrals, it is clear that the distributions are continuous in $\Delta_-$ near 0, so that the overlap distributions for ultra-light fields match smoothly to the massless distributions as $m\ell_{dS} \to 0$. This establishes that the ultrametricity of [13] extends to very small mass.

Also, we can get a qualitative picture by studying the distributions numerically. The products are easy to compute, but the oscillatory Laplace transform is inconvenient. Instead, we do the transform approximately by numerically searching for saddle points. As a first example, $dS_{1+1}$ probability distributions for a single distance $\delta_\Delta(1,2)$ are shown in Figure 4. For zero mass, we recover the Gumbel distribution of [13]. As the mass-squared increases towards the critical value $3d^2/16$, we see the lopsidedness fading as the Gumbel turns into a Gaussian.

Figure 4: Saddle point approximations of $dS_{1+1}$ overlap distributions, measured in units of the variance, for three different values of the mass. $\Delta_- = d/4$ corresponds to $m\ell_{dS} = 0.43$.
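The fading lopsidedness can be quantified without performing the Laplace transform, by reading cumulants off the product formula. The sketch below is a rough illustration under stated assumptions: $d = 1$, one mode pair per positive integer $n$, and a moment generating function of the form $\prod_n e^{s/\beta_n}/(1 + s/\beta_n)$ with $\beta_n \propto n^{1-2\Delta_-}$. Under these assumptions the cumulants are $\kappa_m = (m-1)!\sum_n \beta_n^{-m}$, so the skewness is $2\zeta(3p)/\zeta(2p)^{3/2}$ with $p = 1 - 2\Delta_-$. The function names are ours.

```python
import math


def zeta(a, n_terms=100000):
    """Riemann zeta(a) for a > 1: partial sum plus an Euler-Maclaurin tail estimate."""
    partial = sum(n ** (-a) for n in range(1, n_terms + 1))
    tail = n_terms ** (1.0 - a) / (a - 1.0) - 0.5 * n_terms ** (-a)
    return partial + tail


def overlap_skewness(delta_minus):
    """Skewness of the two-configuration distance distribution in dS_{1+1}.

    Assumes beta_n proportional to n^p with p = 1 - 2*delta_minus; requires
    delta_minus < 1/4 so that the variance sum converges.
    """
    if delta_minus >= 0.25:
        raise ValueError("variance sum diverges for delta_minus >= d/4")
    p = 1.0 - 2.0 * delta_minus
    return 2.0 * zeta(3.0 * p) / zeta(2.0 * p) ** 1.5


if __name__ == "__main__":
    for delta_minus in (0.0, 0.05, 0.1, 0.15, 0.2, 0.24):
        print(delta_minus, overlap_skewness(delta_minus))
```

At zero mass this reproduces the magnitude of the Gumbel skewness, $12\sqrt{6}\,\zeta(3)/\pi^3 \approx 1.14$, and the value decreases towards zero as $\Delta_-$ approaches $d/4$, consistent with the approach to a Gaussian.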

To check for ultrametricity, we need to evaluate the triple overlap distribution, $P(\delta_1, \delta_2, \delta_3)$. The ultrametricity of [13] shows itself when one distance is small, by preferring that the other two distances be equal. To check for this behavior, we plot a conditional probability, $P(\delta \,|\, 2, -3)$, in Figure 5. If the distribution were truly ultrametric, this would be a delta function enforcing $\delta = 2$. And, indeed, for zero mass, the distribution is rather peaked near $\delta = 2$. As the mass increases from zero, the peak near $\delta = 2$ broadens and moves left towards some kind of non-ultrametric compromise between $-3$, $0$, and $2$.

Figure 5: Saddle point approximations of the $dS_{1+1}$ conditional probability $P(\delta \,|\, 2, -3)$, measured in units of the variance of the corresponding double overlap distributions, for three different values of the mass. Again, the critical mass is $m\ell_{dS} = 0.43$.

It is worth emphasizing that the conditional probability plotted is very conditional, in the sense that the absolute probability for having any of the three distances equal to $-3$ is extremely small. This is apparent in Figure 4. To see the sharp ultrametric peaking, we are forced to evaluate the triple overlap distribution in a very rare region of parameter space. If, instead, we were to plot $P(\delta \,|\, 2, -1)$, we would find little or no evidence of ultrametricity.

3.2.3 Heavier fields 

We now turn to non-ultra-light fields, for which $\Delta_- > d/4$. For such fields, the scaling factor in (60) has a different form. The reason is that the infinite product (61) diverges, so we can't naively remove the smearing cutoff on the field modes. Instead, we regulate the product with a comoving cutoff on momentum that is a large but fixed multiple of $1/\eta$.

With this prescription, one can check that the definition (60) ensures a finite late-time limit for the distance distribution.

In fact, the calculation simplifies rather dramatically, because the product is dominated by the most ultraviolet modes. In the late-time limit, we find

$\langle e^{-s\delta_\Delta(1,2)}\rangle = e^{s^2/2}$, (62)

and a similar Gaussian formula for the moments of the triple-overlap distribution. The Laplace transforms are simple, giving $P(\delta)$ as a Gaussian. Normalizing the variance to one, we find the triple-overlap distribution

$P_{\Delta_- > d/4}(\delta_1, \delta_2, \delta_3) = \frac{2}{3\sqrt{3}\,\pi^{3/2}} \exp\left\{-\frac{5}{9}\left(\delta_1^2 + \delta_2^2 + \delta_3^2\right) + \frac{2}{9}\left(\delta_1\delta_2 + \delta_1\delta_3 + \delta_2\delta_3\right)\right\}$. (63)

The conditional probability for $\delta_1$, given $\delta_2$ and $\delta_3$, is a Gaussian, peaked at $\delta_1 = (\delta_2 + \delta_3)/5$. In particular, if $\delta_2 = -\delta_3$, the conditional probability for $\delta_1$ is symmetric and peaked at zero. This lack of attraction towards the larger of the two other distances provides a sharp criterion for the absence of ultrametricity for $\Delta_- > d/4$. Based on the first half of this paper, one might expect the transition to happen at $\Delta_- = d/2$, not $\Delta_- = d/4$. Apparently, memory is necessary for non-Gaussianity of the Anninos-Denef overlap distributions, but not sufficient.¹⁰
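The Gaussian statements above can be cross-checked with a few lines of arithmetic. The sketch assumes that the three distances are jointly Gaussian with unit variance and a common pairwise covariance $c$; expanding the triple-overlap moments to second order in the $s_i$, with the variance normalized to one, suggests $c = 1/4$, which we take as an input here. The function names are hypothetical.

```python
def conditional_mean_delta1(delta2, delta3, c=0.25):
    """E[delta_1 | delta_2, delta_3] for a zero-mean Gaussian triple with
    unit variances and equal pairwise covariance c."""
    # Regression weights: (c, c) times the inverse of [[1, c], [c, 1]].
    weight = c * (1.0 - c) / (1.0 - c * c)
    return weight * (delta2 + delta3)


def precision_entries(c=0.25):
    """Diagonal and off-diagonal entries of the inverse 3x3 covariance matrix."""
    denom = (1.0 - c) * (1.0 + 2.0 * c)
    return (1.0 + c) / denom, -c / denom


if __name__ == "__main__":
    diag, off = precision_entries()
    # The exponent of the density is -(diag/2) * sum(delta_i^2) - off * sum_{i<j} delta_i delta_j.
    print(diag / 2.0, -off)
    print(conditional_mean_delta1(2.0, -3.0), conditional_mean_delta1(3.0, -3.0))
```

With $c = 1/4$ the exponent coefficients come out as $5/9$ and $2/9$, and the conditional mean is $(\delta_2 + \delta_3)/5$, so conditioning on distances $2$ and $-3$ gives a peak at $-0.2$ rather than near the larger distance.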

4 Conclusion 

Memory is defined as the existence of correlation between global variables at infinity and local variables at some finite point. A well-studied memory/forgetfulness transition happens in the Ising model on the tree, suggesting a critical value of the dimension $\Delta = d/2$ for fluctuating dynamics in (EA)dS. Indeed, we found:

• Free fields in EAdS with standard quantization have $\Delta > d/2$ and forget perturbations deep in the bulk. However, with alternate quantization, the dimension can be less than $d/2$, and global variables at the boundary remain sensitive to such perturbations.

• Free fields in dS never fall off faster than $\Delta = d/2$. Despite local cosmic no-hair, global memory always exists.

• The transition at $d/2$ can be understood in boundary CFT terms without assuming a free theory.

We discussed the extreme states implied by the existence of memory in such systems. In de Sitter, we find that the ultrametric structure of these states persists to finite positive mass but disappears at $\Delta_- = d/4$. Memory is necessary but not sufficient for ultrametricity.

A Super-horizon wave function in de Sitter space

In this appendix, we review the calculation of the super-horizon wave function for a free massive field in the Bunch-Davies vacuum of de Sitter space, following [25] and [17]. The wave function $\Psi$ depends on the spatial field $\phi(\mathbf{x})$, or its Fourier transform $\phi_{\mathbf{k}}$, at a given conformal time $\eta_0$. It is given by a path integral over field configurations that satisfy a vacuum condition in the asymptotic past, and are equal to $\phi(\mathbf{x})$ at time $\eta_0$. As always in Gaussian theories, we can do this path integral by evaluating the action on the appropriate solution $\phi(\mathbf{x}, \eta)$ of the equations of motion,

$\Psi[\eta_0, \phi(\mathbf{x})] \propto e^{-iS_{cl}[\phi(\mathbf{x}, \eta)]}$. (64)

As in the main body of the paper, we'll work in flat slicing, with metric

$ds^2 = \ell_{dS}^2\,\frac{-d\eta^2 + dx^i dx^i}{\eta^2}, \qquad i = 1, \ldots, d$, (65)

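with $\ell_{dS}$ the de Sitter radius and $-\infty < \eta < 0$. With this metric, the action for a free massive scalar is

$S = \frac{1}{2}\int d^dx\, d\eta\, \left(\frac{\ell_{dS}}{-\eta}\right)^{d+1}\left[\frac{\eta^2}{\ell_{dS}^2}\left((\partial_\eta\phi)^2 - (\partial_i\phi)^2\right) - m^2\phi^2\right]$. (66)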

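As before, we will introduce an explicit IR cutoff by making the identification $x^i \sim x^i + L$. This allows us to decompose the field into spatial Fourier components, $\phi(\eta, \mathbf{x}) = \sum_{\mathbf{k}} \phi_{\mathbf{k}}(\eta)\,e^{i\mathbf{k}\cdot\mathbf{x}}$, with quantized $\mathbf{k} = 2\pi\mathbf{n}/L$. The equations of motion decouple into equations for each $\mathbf{k}$,

$\partial_\eta^2\phi_{\mathbf{k}} - \frac{d-1}{\eta}\,\partial_\eta\phi_{\mathbf{k}} + \left(k^2 + \frac{m^2\ell_{dS}^2}{\eta^2}\right)\phi_{\mathbf{k}} = 0$. (67)

This differential equation is related to Bessel's equation and has the general solution

$\phi_{\mathbf{k}}(\eta) = \eta^{d/2}\left(A_1 H_\nu^{(1)}(k\eta) + A_2 H_\nu^{(2)}(k\eta)\right)$, (68)

$\nu = \frac{1}{2}\sqrt{d^2 - 4m^2\ell_{dS}^2} = \frac{d}{2} - \Delta_-$, (69)

where $H_\nu^{(1)}(k\eta)$ and $H_\nu^{(2)}(k\eta)$ are Hankel functions.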

One linear combination of $A_1$ and $A_2$ is fixed by requiring that at time $\eta_0$, the solution should be equal to the Fourier transform of the argument of the wave function, $\phi_{\mathbf{k}}$. Fixing the other linear combination amounts to making a choice of vacuum. We pick the Bunch-Davies vacuum [26], also known as Hartle-Hawking [27], Euclidean or adiabatic. The prescription is to choose a solution that's purely positive frequency in the asymptotic past, $\eta \to -\infty$. This condition is made simple by the nice asymptotic properties of the Hankel function,

$\lim_{|x| \to \infty} H_\nu^{(1)}(x) \sim e^{ix}$,

$\lim_{|x| \to \infty} H_\nu^{(2)}(x) \sim e^{-ix}$. (70)

We recognize the latter as the positive frequency modes, so the correct solution is

$\phi_{\mathbf{k}}(\eta) = \phi_{\mathbf{k}}\left(\frac{\eta}{\eta_0}\right)^{d/2}\frac{H_\nu^{(2)}(k\eta)}{H_\nu^{(2)}(k\eta_0)}$. (71)

All that remains is to substitute this solution into the action, mode by mode. We can integrate by parts and use the fact that (71) satisfies the equations of motion to reduce the $\eta$ integral to a boundary term at $\eta_0$.¹¹

S d = J2 ^V^) • (72) 

* , v=vo 

k 

We can evaluate the derivative using a Hankel function identity 

n t \ ( d —^_ u_ t *W^ ^ 
o^kW = — n 1- « — ^ V^k- (73) 

We are interested in the wave function for superhorizon modes, for which $k\eta_0$ is much less than one, and the Hankel functions can be expanded as

$H_\nu^{(2)}(x) \approx A(\nu)\,x^{\nu} + B(\nu)\,x^{-\nu} \qquad (x \ll 1)$, (74)

$A(\nu) = \frac{1 - i\cot\nu\pi}{2^{\nu}\,\Gamma[1+\nu]}, \qquad B(\nu) = \frac{i\,2^{\nu}\,\Gamma[\nu]}{\pi}$.



11 A priori, there is also an oscillatory contribution from $\eta \to -\infty$. We kill this piece in the usual way, by rotating the contour for $\eta$ slightly. The condition that the early-time mode is positive frequency ensures that this contribution is exponentially suppressed at early imaginary time.

At this point, it is useful to focus on the square of the wave function, $|\Psi|^2$. This allows us to discard real terms in the action, since they contribute only a phase to $e^{-iS_{cl}}$. Using the expansion above, the assumption $\nu > 0$ and Euler's reflection formula for the $\Gamma$ function, we find

Re(-iS d ) = -LXs 1 £ ( ^~^f^ sin ™> ( 75 ) 

so that, finally,

$|\Psi|^2 \sim \exp\Big\{-2\sum_{\mathbf{k}} \beta(k, \eta_0)\,|\phi_{\mathbf{k}}|^2\Big\}$, (76)

a n \ r[i-i/] fL d efe 1 k*'\ , n , 
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